Expert's Opinion

How To Spot Vulnerabilities in Skincare Clinical Studies

Industry expert Laurence Dryer PhD provides a guide for beginners.

The clinical study is the most powerful tool a professional skincare brand has to substantiate the efficacy of its products. Clinical studies and their accompanying before-and-after photos are the ultimate proof not only that the formulators have selected the right ingredients and developed the right chassis, but also that the finished goods deliver results. A quality clinical study will generally be published in a reputable peer-reviewed journal, disclose all the specifics of the study, and show numerical and visual proof of efficacy, tolerability and safety. However, the skincare industry is littered with pseudo-publications, “pay-to-play” columns, white papers and case studies, all of them widely presented as evidence.

It is important to remember that although anecdotal data can be part of the initial scientific process, all data should be subjected to the same scrutiny and objective verification. No study is perfect, and good studies point at their own vulnerabilities, but it does help to be able to spot weaknesses in studies that don’t. Here are a few tricks of the trade to detect exposure in an otherwise seemingly solid study.

The Publication Medium: Credibility

Look for the journal, book publisher or website. An obscure journal with a low impact factor may apply less scrutiny, and some journals are not peer-reviewed at all. Book chapters can make excellent review sources, but they are not to be taken as primary literature, i.e., sources of data. The credibility of websites depends entirely on their affiliation: look for sites that belong to research organizations or hospitals instead of purpose- or charity-driven sites. White papers are not reviewed and are potentially biased.

The Study Sponsors or Authors’ Affiliations: Objectivity

Look for the academic or industrial affiliation of the authors (usually right below their names underneath the study title), and/or for signs of industrial support (usually at the end of the publication) by companies that might benefit from the study results. Neither should invalidate the data by itself, but both matter, especially if the results are too good to be true. This is not to say the authors falsified data, but rather that the data need to be taken in a larger context and replicated by different sources.

The Statistics: Appropriate Choices

Proper statistical setup is paramount. Although it takes somewhat of an expert eye to determine if the right statistical test was employed, it behooves any reader to at least check whether any statistics were performed at all, or whether the study was merely a case study (an in-depth examination of one patient).

The p value is the success indicator of a statistical test. A reported p value below 0.05 is conventionally taken to indicate statistical significance: if the product truly had no effect, results at least as pronounced as those observed would be expected less than 5% of the time. If there are no statistics, or if the numbers are merely directional (a trend that does not reach significance), then the data are not significant at that point for that study.
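
For readers who want to see what a p value actually measures, here is a minimal sketch in Python. It uses an exact sign-flip (permutation) test on paired before-and-after scores, which is only one of several tests a study might legitimately use. The function name and the grades are hypothetical and chosen purely for illustration.

```python
from itertools import product


def sign_flip_p_value(before, after):
    """Exact two-sided sign-flip (permutation) test on paired scores.

    If the product had no effect, each subject's change would be equally
    likely to be positive or negative, so every pattern of sign flips is
    equally plausible. The p value is the share of patterns whose total
    change is at least as large, in absolute value, as the observed one.
    """
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    patterns = list(product((1, -1), repeat=len(diffs)))
    extreme = sum(
        1
        for signs in patterns
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12
    )
    return extreme / len(patterns)


# Hypothetical wrinkle-severity grades (lower is better) for 8 subjects.
before = [5, 6, 5, 7, 6, 5, 6, 7]
after = [4, 5, 5, 5, 5, 4, 5, 6]

p_value = sign_flip_p_value(before, after)
print(f"p = {p_value:.4f}")  # 4 of 256 sign patterns -> p = 0.0156
```

With these made-up grades, only 4 of the 256 possible sign patterns are as extreme as what was observed, so p is about 0.016 and falls below the 0.05 threshold. Note that the test enumerates every pattern, so it is practical only for small panels.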

Statistical practice allows for the selective elimination of some data points, called outliers. Qualifying data points are generally two standard deviations or more away from the mean of the rest of the data. If appropriately justified, such exclusions are fine, but because they can skew the results favorably, they should certainly be scrutinized.
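
The two-standard-deviation rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended procedure: the function name, the threshold default and the improvement figures are all hypothetical, and a real study should state its own exclusion criteria.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=2.0):
    """Flag points lying `threshold` or more standard deviations away
    from the mean of the rest of the data (the point itself excluded)."""
    values = list(values)
    flagged = []
    for i, value in enumerate(values):
        rest = values[:i] + values[i + 1:]
        spread = stdev(rest)
        if spread > 0 and abs(value - mean(rest)) >= threshold * spread:
            flagged.append((i, value))
    return flagged


# Hypothetical percent improvement in hydration for 8 subjects.
improvements = [12, 15, 14, 13, 16, 15, 14, 45]

outliers = flag_outliers(improvements)
kept = [v for i, v in enumerate(improvements) if (i, v) not in outliers]

print("Flagged (index, value):", outliers)
print(f"Mean with all subjects: {mean(improvements):.1f}")
print(f"Mean after exclusion:   {mean(kept):.1f}")
```

Here a single subject moves the average improvement from about 14 to 18. This is why a reader should check which points were removed, and also which suspicious points were conveniently kept.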

The Before-and-Afters: Cherry-Picking

The lifeblood of a skincare clinical study is the before-and-after photograph. This is what convinces the practitioner and the patient to undergo treatment or purchase a product. Obtaining accurate photographic comparison is extremely difficult without a specialized system that immobilizes the subject and creates controlled and reproducible lighting conditions. Thankfully, such systems exist, but even in the best of conditions, it is hard to define what a “representative” picture is. Do the photographs shown represent an average responder, or the best responder? A good way to spot photos that do not represent the clinical efficacy that should be expected from the product or the compound is to compare them to the numbers from the study. Do the photos show spectacular pigmentation improvement in four weeks, yet the numbers show that only 15% of women show clinical improvement by that time? Those pictures show an overly optimistic version of efficacy.

The Wash-Out Period: Rigged Games

Before a study starts, Good Clinical Practices prescribe a wash-out period. A wash-out period is a time during which participants stop a previous treatment or product before entering the study. Its purpose is to ensure that the effects of the previous products do not impact the results of the study. In the context of skincare studies, a wash-out period typically consists of limiting topical use to a non-therapeutic cleanser or soap and a sunscreen.

To standardize the panel of participants, the wash-out products will often be provided by the authors or the sponsors. Standardization is a sign of a solid study, but look carefully for wash-out products that may artificially enhance the apparent efficacy of the product. For example, a soap that happens to be slightly drying during wash-out will increase the likelihood of observing moisturization results.

The Controls: Meaningful Experiments

All good clinical studies should be properly controlled, but not all controls are created equal. Placebo controls are the best and most rigorous, and a randomized study that can demonstrate superiority to placebo is considered properly controlled from a product perspective. However, studies of topical products in emulsion form (creams, lotions, etc.) present the unique challenge of a naturally active vehicle, especially if the study is short, moisture-focused and lacks a regression period. In other words, the chassis or base of the product itself delivers such a high proportion of the net benefit that superiority to a placebo is difficult to demonstrate. In such cases, another control may be more meaningful.

The vast majority of skincare clinical studies are monadic (only one group of participants) and evaluate product efficacy by comparing clinical endpoints before and after treatment, making each participant their own control. In the same vein, a randomized split-face study compares one side of the face, where treatment was applied, to the other side, which has either remained untreated or been treated with a placebo or a competing product.

Control groups can also be separate from treatment groups, in which case the study becomes comparative, and this too is an appropriate form of control. There are other forms of study controls, depending on the protocols, but always look for the presence of some form of control: it is a marker of stringency.

Regimens: Efficacy Masking and Confounding Factors

Good Clinical Practices prescribe sunscreen use at all times during a clinical study, but sunscreen is also one of the best anti-aging products on the market. How, then, can the efficacy of a core product be teased out of the results of a protocol that also includes a sunscreen? Look for a second group using the same regimen minus the core product (the equivalent of a placebo group), and for a generous number of subjects to compensate for the weakness. If there is no such group, there is nothing terribly wrong with the study; just keep in mind that its conclusions pertain to the entire regimen, not the individual products.
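
When a second group on the regimen minus the core product is reported, the arithmetic of teasing out the core product is simple, as the sketch below suggests. The grades, group sizes and variable names are hypothetical. The calculation gives only a point estimate, so it says nothing about statistical significance, and a real comparison would need far more than five subjects per group.

```python
from statistics import mean


def mean_change(before, after):
    """Average per-subject change (after minus before)."""
    return mean([a - b for b, a in zip(before, after)])


# Hypothetical photo-aging grades (lower is better), 5 subjects per group.
# Group A: cleanser + sunscreen + core product.
a_before, a_after = [6, 7, 6, 8, 7], [4, 5, 4, 5, 5]
# Group B: cleanser + sunscreen only (the regimen minus the core product).
b_before, b_after = [6, 7, 7, 8, 6], [5, 6, 6, 7, 5]

regimen_change = mean_change(a_before, a_after)     # whole regimen
background_change = mean_change(b_before, b_after)  # sunscreen and cleanser
core_product_change = regimen_change - background_change

print(f"Whole regimen:         {regimen_change:+.1f} grades")
print(f"Regimen minus product: {background_change:+.1f} grades")
print(f"Attributable to core:  {core_product_change:+.1f} grades")
```

In this toy example the regimen improves grades by 2.2 points, but almost half of that shows up in the group that never received the core product. Without group B, the full 2.2 points could only be credited to the regimen as a whole.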

The Population Selection: Relevancy

A critical bias to look for in studies is population selection. The population must be relevant in size, in damage severity and in distribution. Say, for example, you are reading an anti-aging study, and the protocol ostensibly calls for subjects aged 35-65. If available, look at the demographics of the subjects: Is the age evenly distributed? Or are a few subjects in the upper age group and the majority trending young? The answer could invalidate the statistical method and bias the data. Severity (on a photo-aging or any other pathology scale) follows the same rule.
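
If a publication lists subject ages, the distribution check described above takes only a few lines. The sketch below is illustrative: the ages are invented, and the 10-year bands are an arbitrary choice rather than a convention taken from any protocol.

```python
from statistics import median


def age_band_counts(ages, low=35, high=65, width=10):
    """Count subjects per age band; the last band includes `high`."""
    counts = {}
    for start in range(low, high, width):
        end = high if start + width >= high else start + width - 1
        counts[(start, end)] = 0
    for age in ages:
        if not low <= age <= high:
            raise ValueError(f"age {age} is outside the protocol range")
        for start, end in counts:
            if start <= age <= end:
                counts[(start, end)] += 1
                break
    return counts


# Hypothetical ages for a 20-subject panel recruited as "35-65".
ages = [35, 36, 36, 37, 38, 38, 39, 40, 41, 41,
        42, 43, 44, 46, 47, 49, 52, 58, 61, 64]

bands = age_band_counts(ages)
for (start, end), count in bands.items():
    print(f"{start}-{end}: {'#' * count} ({count})")
print(f"Median age: {median(ages)} (midpoint of the range is 50)")
```

This invented panel technically satisfies the 35-65 protocol, yet 13 of its 20 subjects are under 45 and only 3 are 55 or older, so its results say little about the older end of the range.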

A larger panel is not always necessary. The rule of thumb is that the size of the population should be inversely related to the sensitivity of the measurement method: the more sensitive the method, the fewer subjects are needed for significance, and vice versa. For example, a simple 90-day monadic anti-aging study relying on the dermatologist’s eye needs a minimum of 30 subjects. The eye is a wonderfully holistic instrument, but it is not very quantitative. In contrast, a short study using a precision instrument may only need 10 subjects to extract statistical significance.
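
The trade-off between measurement sensitivity and panel size can be made concrete with a textbook normal-approximation formula for a paired (monadic) design. The sketch below is not how the 30- and 10-subject figures above were derived. The function name, the noise and effect values, and the 5% significance and 80% power settings are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_subjects(noise_sd, expected_change, alpha=0.05, power=0.80):
    """Approximate panel size for a paired before/after comparison.

    n = ((z_(1 - alpha/2) + z_power) * noise_sd / expected_change) ** 2

    noise_sd is the standard deviation of the per-subject changes, in the
    same units as expected_change. A noisier (less sensitive) measurement
    method means a larger noise_sd and therefore a larger panel.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) * noise_sd / expected_change) ** 2)


# Same expected improvement (1 unit), two hypothetical measurement methods.
coarse_method = required_subjects(noise_sd=2.0, expected_change=1.0)
precise_method = required_subjects(noise_sd=1.0, expected_change=1.0)

print(f"Coarse method (noise = 2x the effect): {coarse_method} subjects")
print(f"Precise method (noise = 1x the effect): {precise_method} subjects")
```

Under these assumptions, halving the measurement noise cuts the required panel from 32 subjects to 8. The normal approximation tends to be somewhat optimistic for panels this small, so treat the outputs as orders of magnitude, not as protocol requirements.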

Conclusion

No study is perfect, and the skincare industry, with its need for speedy commercialization, certainly has its share of flawed clinical studies. It is therefore important to examine each study with an appreciative yet critical eye. Knowing a study’s weaknesses is the first step in forming an opinion about the validity of product positioning. It is equally important to remember that the scientific process is about creating a collection of data sources and building consensus: scientific truth is never single-sourced.

About the Author

Laurence Dryer, PhD is a co-founder of Skin Sage Advisors. The private consulting firm offers strategic, scientific and operational expertise to help private equity and brands navigate the complex world of skincare, domestically and internationally. Dryer’s expertise is founded on her prestigious academic career in neuroscience and cemented by her R&D tenures at J&J, The Honest Company and BASF Beauty. Later, at Valeant, Obagi and Clinical Skin, she led scientific, regulatory, clinical and product development strategies. Dryer specializes in scouting innovation, developing and characterizing products, and making science accessible through impactful content. She has dozens of publications and patents.
